---
title: Stemming
---
import { Aside } from '@astrojs/starlight/components';


Orama can apply `stemming` to its input, reducing related word forms to a common stem. This lets the engine match more variants of a word with a single query and keeps the index smaller.

<Aside type="note" title='What is stemming?'>
In linguistic morphology and information retrieval, stemming is the process of reducing inflected (or sometimes derived) words to their word stem, base, or root form—generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root. Algorithms for stemming have been studied in computer science since the 1960s. Many search engines treat words with the same stem as synonyms as a kind of query expansion, a process called conflation.

Read more: [_Wikipedia_](https://en.wikipedia.org/wiki/Stemming)
</Aside>
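To make the idea concrete, here is a minimal sketch of how stemming conflates related words into a single index entry. It is a toy suffix stripper written for illustration only, not the algorithm Orama uses: the names `toyStem` and `buildIndex` and the suffix list are assumptions made for this example.

```javascript
// Toy suffix-stripping stemmer, for illustration only.
// NOT Orama's stemmer: the suffix list below is an arbitrary assumption.
const SUFFIXES = ["ing", "ed", "es", "s"];

function toyStem(word) {
  const lower = word.toLowerCase();
  for (const suffix of SUFFIXES) {
    // Keep at least three characters so short words such as "he" are left alone.
    if (lower.endsWith(suffix) && lower.length - suffix.length >= 3) {
      return lower.slice(0, -suffix.length);
    }
  }
  return lower;
}

// Build a tiny inverted index that maps each stem to the ids of the
// documents containing a word with that stem.
function buildIndex(docs) {
  const index = new Map();
  docs.forEach((text, id) => {
    for (const token of text.split(/\s+/).filter(Boolean)) {
      const stem = toyStem(token);
      if (!index.has(stem)) index.set(stem, new Set());
      index.get(stem).add(id);
    }
  });
  return index;
}

const docs = ["walking home", "she walked", "he walks"];
const index = buildIndex(docs);

// "walking", "walked", and "walks" share the single key "walk",
// so a query for any of those forms finds all three documents.
const hits = [...(index.get(toyStem("Walked")) ?? [])];
console.log(hits); // [ 0, 1, 2 ]
console.log(index.size); // 4 keys: walk, home, she, he
```

Without stemming, the same three documents would need separate index entries for "walking", "walked", and "walks", and a query for one form would miss the other two.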

<Aside type="caution">
As of Orama 1.0.0, only the English stemmer ships with Orama. Stemmers for other languages are published in the `@orama/stemmers` package, which must be installed separately.
</Aside>

When stemming is enabled, Orama uses **the English language analyzer** by default. To override this, import a stemmer for another language and set the `language` and `stemmer` properties when initializing the database.

```javascript copy
import { create } from "@orama/orama";
import { stemmer, language } from "@orama/stemmers/italian";

const db = create({
  schema: {
    author: "string",
    quote: "string",
  },
  components: {
    tokenizer: {
      stemming: true,
      language,
      stemmer,
    },
  },
});
```

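Orama currently supports the languages listed below. Except where noted, each has a stemmer: the English one is built in, and the others are available through `@orama/stemmers`.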

- Arabic
- Armenian
- Bulgarian
- Chinese (Mandarin - stemmer not supported)
- Danish
- Dutch
- English
- Finnish
- French
- German
- Greek
- Hindi
- Hungarian
- Indonesian
- Irish
- Italian
- Nepali
- Norwegian
- Portuguese
- Romanian
- Russian
- Sanskrit
- Serbian
- Slovenian
- Spanish
- Swedish
- Tamil
- Turkish
- Ukrainian
